Tailored Similarity Spaces for the Prediction of Physicochemical Properties

Authors

  • Ovidiu Ivanciuc
  • Mircea V. Diudea
  • Brian D. Gute
  • Subhash C. Basak
  • Denise Mills
  • Douglas M. Hawkins

Abstract

Motivation. In the past, molecular similarity spaces have been developed from arbitrary sets of molecular properties or theoretical descriptors, and the results of property estimation based on these methods have always been inferior to those of SAR and QSAR models. Tailored quantitative molecular similarity analysis (QMSA) methods attempt to create similarity spaces specific to a property of interest, rather than purely arbitrary spaces that characterize the general aspects of all chemicals within the space, or intuitively selected structure spaces whose elements are chosen subjectively. To this end, we created three similarity spaces, two tailored and one non-tailored, for a set of 166 chemicals for which both log P and normal boiling point (BP) data are available. Each tailored space was tailored to one of the two properties, while the third similarity space was developed using standard QMSA methods.

Method. Ridge regression was used to determine which of the available molecular descriptors were most useful in modeling each property. Fifteen topological descriptors were selected for use as dimensions within each of the tailored similarity spaces. For the arbitrary similarity space, the same number of principal components was derived by principal component analysis.

Results. The log P tailored similarity space was superior to both the arbitrary structure space and the BP tailored space for the estimation of log P. Likewise, the BP tailored similarity space was superior to the arbitrary structure space for the estimation of BP. Interestingly, the space tailored to model log P performed as well at modeling BP as did the BP tailored space. This unexpected result is explained by the degree of overlap between the indices used in the two tailored spaces and by the presence of connectivity indices related to BP in the log P model.

Conclusions. Based on the results of this study and of a previous study, the tailored similarity method is a promising approach to creating property-specific similarity spaces derived from structural descriptors. Further work is necessary to determine the true utility of this method with large, diverse data sets.
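
The workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: descriptors are ranked by the absolute value of their standardized ridge coefficients (the abstract does not state the selection criterion), properties are estimated as the mean over the k nearest neighbors by Euclidean distance under leave-one-out, and the data are synthetic random numbers rather than real topological descriptors. All function names are hypothetical, and the toy example keeps 3 dimensions instead of the 15 used in the study.

```python
import numpy as np

def standardize(X):
    """Scale each descriptor column to zero mean and unit variance."""
    sd = X.std(axis=0)
    sd[sd == 0] = 1.0  # guard against constant descriptors
    return (X - X.mean(axis=0)) / sd

def ridge_coefficients(Z, y, alpha=1.0):
    """Closed-form ridge solution for standardized Z and centered y."""
    p = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + alpha * np.eye(p), Z.T @ (y - y.mean()))

def tailored_space(X, y, n_dims, alpha=1.0):
    """Keep the n_dims descriptors with the largest |ridge coefficient|.
    Assumption: the ranking criterion is ours, not taken from the paper."""
    Z = standardize(X)
    beta = ridge_coefficients(Z, y, alpha)
    keep = np.argsort(-np.abs(beta))[:n_dims]
    return Z[:, keep], keep

def pca_space(X, n_dims):
    """Non-tailored space: scores on the first n_dims principal components."""
    Z = standardize(X)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_dims].T

def knn_loo_predict(space, y, k=5):
    """Leave-one-out estimate: mean property of the k nearest neighbors."""
    d = np.linalg.norm(space[:, None, :] - space[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a compound may not be its own neighbor
    nearest = np.argsort(d, axis=1)[:, :k]
    return y[nearest].mean(axis=1)

def loo_correlation(space, y, k=5):
    """Correlation between leave-one-out estimates and observed values."""
    return np.corrcoef(knn_loo_predict(space, y, k), y)[0, 1]

# Synthetic stand-in data: 166 "compounds", 40 "descriptors",
# with the property depending on the first three descriptors only.
rng = np.random.default_rng(0)
X = rng.normal(size=(166, 40))
y = X[:, :3] @ np.array([2.0, -1.5, 1.0]) + 0.1 * rng.normal(size=166)

tailored, kept = tailored_space(X, y, n_dims=3)
arbitrary = pca_space(X, n_dims=3)
r_tailored = loo_correlation(tailored, y)
r_arbitrary = loo_correlation(arbitrary, y)
print("descriptors kept:", sorted(kept.tolist()))
print("tailored r:", round(r_tailored, 3), "arbitrary r:", round(r_arbitrary, 3))
```

On this toy data the tailored space is built only from descriptors that carry information about the property, whereas the principal components summarize overall descriptor variance regardless of the property. That is the contrast the study draws between tailored and arbitrary spaces.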



Publication date: 2002